MedLDA: maximum margin supervised topic models
Abstract
A supervised topic model can use side information, such as ratings or labels associated with documents or images, to discover more predictive low-dimensional topical representations of the data. However, existing supervised topic models predominantly employ likelihood-driven objective functions for learning and inference, leaving the popular and potentially powerful max-margin principle unexploited for seeking predictive representations of data and more discriminative topic bases for the corpus. In this paper, we propose the maximum entropy discrimination latent Dirichlet allocation (MedLDA) model, which integrates the mechanism behind max-margin prediction models (e.g., SVMs) with the mechanism behind hierarchical Bayesian topic models (e.g., LDA) under a unified constrained optimization framework, and yields latent topical representations that are more discriminative and more suitable for prediction tasks such as document classification or regression. The principle underlying the MedLDA formalism is quite general and can be applied to joint max-margin and maximum likelihood learning of directed or undirected topic models when supervising side information is available. Efficient variational methods for posterior inference and parameter estimation are derived, and extensive empirical studies on several real data sets are provided. Our experimental results demonstrate qualitatively and quantitatively that MedLDA can: 1) discover sparse and highly discriminative topical representations; 2) achieve state-of-the-art prediction performance; and 3) be more efficient than existing supervised topic models, especially for classification.
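As a rough illustration of the "unified constrained optimization framework" mentioned in the abstract, the classification case can be sketched schematically as a variational topic-model objective plus SVM-style margin constraints. The notation is assumed here for illustration and is not quoted from this page: $\mathcal{L}(q)$ stands for a variational upper bound on the negative log-likelihood of the topic model (with any prior terms folded in), $\eta$ for the classifier weights, $\bar{z}_d$ for the average topic assignment of document $d$, $f$ for a joint feature map, $\xi_d$ for slack variables, and $C$ for a regularization constant.

```latex
\min_{q,\,\alpha,\,\beta,\,\xi}\quad
  \mathcal{L}(q) \;+\; C \sum_{d=1}^{D} \xi_d
\qquad \text{s.t.}\qquad
  \mathbb{E}_q\!\left[\eta^{\top}\bigl(f(y_d,\bar{z}_d) - f(y,\bar{z}_d)\bigr)\right] \;\ge\; 1 - \xi_d,
  \qquad \xi_d \ge 0,
  \qquad \forall d,\ \forall y \ne y_d .
```

Read this way, the first term plays the role of maximum likelihood learning of the topic model, while the constraints require the true label of each document to win by an expected margin, which is the sense in which the two mechanisms are coupled.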
Similar resources
ERD-MedLDA: Entity relation detection using supervised topic models with maximum margin learning
This paper proposes a novel application of topic models to do entity relation detection (ERD). In order to make use of the latent semantics of text, we formulate the task of relation detection as a topic modeling problem. The motivation is to find underlying topics that are indicative of relations between named entities (NEs). Our approach considers pairs of NEs and features associated with the...
Monte Carlo Methods for Maximum Margin Supervised Topic Models
An effective strategy to exploit the supervising side information for discovering predictive topic representations is to impose discriminative constraints induced by such information on the posterior distributions under a topic model. This strategy has been adopted by a number of supervised topic models, such as MedLDA, which employs max-margin posterior constraints. However, unlike the likelih...
A Combination of Topic Models with Max-margin Learning for Relation Detection
This paper proposes a novel application of a supervised topic model to do entity relation detection (ERD). We adapt Maximum Entropy Discriminant Latent Dirichlet Allocation (MEDLDA) with mixed membership for relation detection. The ERD task is reformulated to fit into the topic modeling framework. Our approach combines the benefits of both, maximum-likelihood estimation (MLE) and max-margin est...
Max-margin Latent Dirichlet Allocation for Image Classification and Annotation
Much work in image classification and labeling uses topic models (e.g. LDA [1]), which are a class of powerful tools originally proposed in text modeling and have gained much popularity in computer vision recently. Despite the success of topic models in visual recognition, we believe there are some limitations of the way that topic models are used in computer vision. First of all, most topic mo...
Investigating Time-sensitive Topic Model Approaches for Action Recognition
In this paper, we present several attempts at using topic models for action recognition in videos. We show that time-sensitive topic models help recognize actions when little training data is available. We also exhibit some limitations of these models when dealing with complex videos. New applications of these models in semi-supervised settings and the use of inherently discriminant models such...
Journal:
- Journal of Machine Learning Research
Volume 13
Publication date: 2012